Three Benchmarks for Distributional Approaches to Natural Language Syntax
Author
Abstract
Human language abilities are far richer than what is represented in the kinds of monolingual corpora that are standardly used to evaluate statistical models of language learning. This article summarizes a series of findings from language acquisition, cross-language typology, and language processing that illustrate the challenges any serious model of natural language syntax must meet. Even a putative ideal statistical learner of co-occurrences in corpora will struggle to meet the challenges of complexity, cross-language consistency, and causality, unless it is able to take advantage of the rich representational primitives motivated by linguistics and psycholinguistics.
Similar Resources
Crosslingual and Multilingual Construction of Syntax-Based Vector Space Models
Syntax-based distributional models of lexical semantics provide a flexible and linguistically adequate representation of co-occurrence information. However, their construction requires large, accurately parsed corpora, which are unavailable for most languages. In this paper, we develop a number of methods to overcome this obstacle. We describe (a) a crosslingual approach that constructs a synta...
Distributed representations for compositional semantics
The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). A lot of research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional approaches—meaning distributed representations that exploit co-occurrence statistics of large corpora—have proved popular and successful across a number of tasks. H...
Efficient, Correct, Unsupervised Learning for Context-Sensitive Languages
A central problem for NLP is grammar induction: the development of unsupervised learning algorithms for syntax. In this paper we present a lattice-theoretic representation for natural language syntax, called Distributional Lattice Grammars. These representations are objective or empiricist, based on a generalisation of distributional learning, and are capable of representing all regular languag...
Quantifier Scope in Categorical Compositional Distributional Semantics
Categorical Compositional Distributional semantics (CCDS) adds compositionality to distributional semantics via a functorial passage from the syntax to the semantics of natural language [4]. Both the syntax and the semantics are represented by compact closed categories. The claim is that regardless of how complex the structure of a sentence can be and what bizarre forms the words therein can ta...
Towards Syntax-aware Compositional Distributional Semantic Models
Compositional Distributional Semantics Models (CDSMs) are traditionally seen as an entirely different world with respect to Tree Kernels (TKs). In this paper, we show that under a suitable regime these two approaches can be regarded as the same and, thus, structural information and distributional semantics can successfully cooperate in CDSMs for NLP tasks. Leveraging distributed trees, we pres...
Publication date: 2004